Minimum Classification Error Training of Hidden Conditional Random Fields for Speech and Speaker Recognition
Abstract
Hidden conditional random fields (HCRFs) are derived from the theory of conditional random fields combined with a hidden-state probabilistic framework. They directly model the conditional probability of a label sequence given the observations. Compared with hidden Markov models, HCRFs offer a number of benefits for the acoustic modeling of speech signals. Prior work trained HCRFs with gradient-descent-based algorithms under the conditional maximum likelihood criterion. In this paper, we extend that methodology by training HCRFs under the minimum classification error (MCE) criterion. Specifically, we adopt a generalized probabilistic descent (GPD)-based training algorithm within the HCRF framework to improve the discriminative capability of acoustic models for speech and speaker recognition. The proposed approach is evaluated on two tasks: speaker identification and Mandarin continuous syllable recognition. Results on the MAT2000 database confirm that the HCRF/GPD approach performs well for both speech recognition and speaker identification, regardless of the length of the test and training speech or the presence of noise. We conclude that HCRF/GPD shows potential for further development in acoustic modeling.
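The abstract does not give the training formulas, so the following is only a minimal sketch of the textbook smoothed minimum classification error loss that GPD-style training usually minimizes. It is not the authors' implementation. The function names `mce_loss` and `mce_loss_slope` and the smoothing parameters `eta`, `gamma`, and `theta` are assumptions made for illustration, and the per-class scores are assumed to be already computed (for an HCRF they would be the log of the class score summed over hidden state sequences).

```python
import math


def mce_loss(scores, correct, eta=1.0, gamma=1.0, theta=0.0):
    """Smoothed MCE loss for one training token (illustrative sketch).

    scores  : list of per-class discriminant values g_j (higher is better)
    correct : index of the true class
    eta     : smoothing of the competitor average (assumed, must be > 0)
    gamma   : slope of the sigmoid (assumed)
    theta   : offset of the sigmoid (assumed)
    """
    competitors = [g for j, g in enumerate(scores) if j != correct]
    # Soft maximum over competing classes, shifted by the max for stability:
    # (1/eta) * log( mean_j exp(eta * g_j) )
    m = max(competitors)
    mean_exp = sum(math.exp(eta * (g - m)) for g in competitors) / len(competitors)
    anti = m + math.log(mean_exp) / eta
    # Misclassification measure: negative when the true class wins.
    d = -scores[correct] + anti
    # Sigmoid turns d into a differentiable approximation of the 0/1 error.
    return 1.0 / (1.0 + math.exp(-gamma * d + theta))


def mce_loss_slope(loss, gamma=1.0):
    """Derivative of the sigmoid loss with respect to d.

    A GPD step scales the gradient of d by this factor before
    updating the model parameters.
    """
    return gamma * loss * (1.0 - loss)
```

In this sketch a token whose true class clearly outscores its competitors yields a loss near 0, a clearly misclassified token yields a loss near 1, and the slope is largest for tokens near the decision boundary, which is where a gradient step has the most effect.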
Similar resources
Hidden Conditional Neural Fields for Continuous Phoneme Speech Recognition
In this paper, we propose Hidden Conditional Neural Fields (HCNF) for continuous phoneme speech recognition, which are a combination of Hidden Conditional Random Fields (HCRF) and a MultiLayer Perceptron (MLP), and inherit their merits, namely, the discriminative property for sequences from HCRF and the ability to extract non-linear features from an MLP. HCNF can incorporate many types of featu...
Large-margin conditional random fields for single-microphone speech separation
Conditional random field (CRF) formulations for single-microphone speech separation are improved by large-margin parameter estimation. Speech sources are represented by acoustic state sequences from speaker-dependent acoustic models. The large-margin technique improves the classification accuracy of acoustic states by reducing generalization error in the training phase. Non-linear mappings inspi...
Speaker Independent Speech Recognition Using Hidden Markov Models for Persian Isolated Words
Monotone string-to-string translation for NLU and ASR tasks
Monotone string-to-string translation problems have to be tackled as part of almost all state-of-the-art natural language understanding and large vocabulary continuous speech recognition systems. In this work, two such tasks will be investigated in detail and improved using conditional random fields, namely concept tagging and grapheme-to-phoneme conversion. Concept tagging is usually one of the...
Journal:
- J. Inf. Sci. Eng.
Volume 29
Publication date: 2013